AITopics | oracle-efficient pac rl

On Oracle-Efficient PAC RL with Rich Observations

Neural Information Processing Systems, Nov-20-2025, 22:17:41 GMT

We study the computational tractability of PAC reinforcement learning with rich observations. We present new provably sample-efficient algorithms for environments with deterministic hidden state dynamics and stochastic rich observations. These methods operate in an oracle model of computation -- accessing policy and value function classes exclusively through standard optimization primitives -- and therefore represent computationally efficient alternatives to prior algorithms that require enumeration. With stochastic hidden state dynamics, we prove that the only known sample-efficient algorithm, OLIVE, cannot be implemented in the oracle model. We also present several examples that illustrate fundamental challenges of tractable PAC reinforcement learning in such general settings.
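
To make the phrase "oracle model of computation" concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical cost-sensitive oracle (the class `CostSensitiveOracle`, its method `best_policy`, and the toy data are all illustrative names that do not appear in the abstract). The point is only that the learner calls an optimization primitive and never iterates over the policy class itself; the toy oracle enumerates a tiny class internally purely as a stand-in for a real optimizer.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Observation = Hashable
Policy = Callable[[Observation], int]
# One logged example: (observation, reward of each action at that observation).
Example = Tuple[Observation, Sequence[float]]


class CostSensitiveOracle:
    """Hypothetical oracle returning the policy with the highest empirical reward.

    The learner never reads `self._policies`; it only calls `best_policy`.
    This toy version enumerates a small finite class internally, standing in
    for whatever optimization routine a real oracle would use.
    """

    def __init__(self, policies: List[Policy]) -> None:
        self._policies = policies
        self.calls = 0  # number of oracle invocations made by the learner

    def best_policy(self, data: Sequence[Example]) -> Policy:
        self.calls += 1
        return max(
            self._policies,
            key=lambda pi: sum(rewards[pi(x)] for x, rewards in data),
        )


def learner(oracle: CostSensitiveOracle, data: Sequence[Example]) -> Policy:
    """An oracle-model learner: one optimization call, no enumeration here."""
    return oracle.best_policy(data)


# Toy policy class (assumed for illustration): two constant policies and a parity rule.
policies: List[Policy] = [lambda x: 0, lambda x: 1, lambda x: x % 2]
data: List[Example] = [(0, [1.0, 0.0]), (1, [0.0, 1.0]), (2, [1.0, 0.0])]

oracle = CostSensitiveOracle(policies)
chosen = learner(oracle, data)
```

In this sketch the cost of learning is measured in oracle calls (`oracle.calls`) rather than in the size of the policy class, which is the distinction the abstract draws against algorithms that require enumeration.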

name change, oracle-efficient pac rl, rich observation, (5 more...)

Reviews: On Oracle-Efficient PAC RL with Rich Observations

Neural Information Processing Systems, May-26-2025, 06:47:09 GMT

Moreover, the reward depends only on x_t and the action, not on the state S_t. The authors then correctly state (again, lines 99-100) that this makes the problem an MDP over X. The paper argues that "The hidden states serve to introduce structure to the MDP and enable tractable learning." I don't understand why this is the case.

assumption, oracle-efficient pac rl, value function, (13 more...)

On Oracle-Efficient PAC RL with Rich Observations

Christoph Dann, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire

Neural Information Processing Systems, Feb-14-2020, 08:11:51 GMT

machine learning, oracle-efficient pac rl, reinforcement learning, (5 more...)